FROM SPEECH SOUNDS TO SYMBOLIC REPRESENTATION – ABOUT NEW APPROACHES TO IMPOSE STRUCTURE ON SPEECH DATA

Authors

  • Louis ten Bosch
  • Annika Hämäläinen
  • Bert Cranen
  • Lou Boves
Abstract

During the last decades, phonetic and phonological knowledge about the structure of speech has provided a basis for the development of computational models for speech recognition, including automatic speech recognition (ASR). Since speech sounds are highly variable, the robustness of the mapping from a speech signal to its discrete symbolic representation is one of the most difficult problems in speech science. The mainstream approach to 'sound-to-symbol' mapping is based on the use of a small set of phonetic-phonologically motivated 'speech units', in combination with a statistical description of these units (obtained on a large speech corpus). In this approach, speech is considered a process that can adequately be represented by a sequence of such speech units ('beads-on-a-string' paradigm, [1]). The performance of computational models of speech recognition has shown that this 'beads-on-a-string' paradigm works reasonably well for utterances that do not deviate much from the patterns included in the training corpus ([2]). However, it is becoming clear that this conventional data-driven approach has serious limitations. Despite the use of ever-larger speech corpora, the performance of computational models of speech recognition faces a ceiling effect and falls short of human performance by an order of magnitude, particularly due to its poor capability to cope with unseen test conditions. Recently, several researchers have suggested exploring radically new approaches to address the sound-to-symbol representation. A common factor in all these new approaches is the use of sophisticated models to better impose knowledge-based structure on raw speech data. 
The issue of using phonological and linguistic structure is central in several lines of current research: on the role of fine phonetic details in lexical decoding ([3]), on the relation between (symbolic) context and pronunciation variation ([4]), and on the design of computational models for human speech processing ([5]). In all these research directions, the combination of statistical data-driven techniques with phonetic-phonological structure is crucial for further improvements.

Similar resources

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applicable areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

Phonetic Explanations for Sound Patterns: Implications for Grammars of Competence

Phonological grammars try to represent speakers’ knowledge so that the ‘natural’ behavior of speech sounds becomes self-evident. Phonetic models have the same goals but have no psychological pretensions. Phonetic models succeed in explaining the natural behavior of speech, whereas phonological representations largely fail. The ‘phonetic naturalness’ requirement in phonological grammars should b...

Comparison of the Effect of Production-Training-Based Therapy with Non-Speech Oral Motor Training on the Speech of 4-6-Year-Old Children with Phonological Disorder

Objective: Speech sound disorders are among the most common speech disorders in children. Speech-language pathologists have long used non-speech oral motor exercises as a facilitative activity in therapy sessions for a wide variety of speech disorders, but there are few controlled empirical data with which to evaluate their effectiveness. This study aimed at comparing the effects of therapeu...

A New Phonetic Model for Continuous Speech Recognition Systems

The main goal of this work is to describe a new model for a large vocabulary continuous speech recognition system using a phonetic-phonological approach. This work proposes a statistical phonetic structure, applied at the phonetic-phonological level, to improve the speech recognition performance in systems with phonetic-phonological modeling. It is shown that the general likelihood scores are i...

Investigating the formal effect of rear wall structure on acoustic parameters of speech halls (Research Article)

The rear wall of a hall is the element furthest from the sound source; therefore, the reflections from this structural element play an important role in music and speech intelligibility, especially for the audience in the rear third of the hall. Hence the form of these structures can be very effective in the acoustical quality of speech halls and auditoria. In this study, four formic structures are ex...

Publication date: 2006